✨ Update logic for HostClaims#3152
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| return nil | ||
| } | ||
|
|
||
| // Update updates a machine and is invoked by the Machine Controller. |
|
|
||
| // Update updates a machine and is invoked by the Machine Controller. | ||
| func (m *HostManager) Update(ctx context.Context) error { | ||
| m.Log.Info("Updating machine") |
There was a problem hiding this comment.
Please find and replace all cases of "machine" here
There was a problem hiding this comment.
And this should be V(1) too unless we detect that actual changes are needed
| return hideConflictError(err) | ||
| } | ||
|
|
||
| // transient rebootAnnotation was successfully transmitted. We can delete it on HostClaim. |
There was a problem hiding this comment.
This is a bit unfortunate: this way it's hard to track when reboot is actually executed. But I don't see a good way out.
There was a problem hiding this comment.
Yes and if the HostClaim patch fails we may even reboot twice in rare cases.
| m.SetConditionHostToFalse( | ||
| metal3api.AssociatedCondition, metal3api.MissingBareMetalHostReason, | ||
| "BareMetalHost associated to the claim not found") | ||
| return errors.New("BareMetalHost not found") |
There was a problem hiding this comment.
I"m curious what the expected remediation is. Is a user supposed to delete and re-create the claim? Or is the controller supposed to provide another BMH?
There was a problem hiding this comment.
In the consumerRef case status.BaremetalHost is reset to nil, so the association logic is performed again (because in the controller introduced later the defer patch will be done as in m3m controller). This was not done here and I have added it.
| } | ||
| err := m.client.Get(ctx, key, &bmh) | ||
| if k8serrors.IsNotFound(err) { | ||
| m.Log.Info("Annotated host not found", "bmh", bmhRef.Name, "bmhNamespace", bmhRef.Namespace) |
| if !consumerRefMatches(bmh.Spec.ConsumerRef, hostClaim) { | ||
| m.Log.Info("The consumer ref does not point to the hostClaim", "consumerRef", bmh.Spec.ConsumerRef) | ||
| hostClaim.Status.BareMetalHost = nil | ||
| return nil, ErrNoBMH |
There was a problem hiding this comment.
Similar question here: what's the expected action in this case? Especially since we only provide a generic error message (which is understandable)?
| sourceRef *corev1.SecretReference, | ||
| namespace string, | ||
| hostName string, | ||
| ) (*corev1.SecretReference, error) { |
There was a problem hiding this comment.
This call should exit early if the target namespace is the same as the source namespace (a non-multi-tenant case).
| conditions.SetMirrorCondition(bmh, m.HostClaim, metal3api.ProvisionedCondition) | ||
| m.SetConditionHostToTrue(metal3api.AssociatedCondition, metal3api.BareMetalHostAssociatedReason) | ||
|
|
||
| if !equality.Semantic.DeepEqual(m.HostClaim.Status, hostOld) { |
There was a problem hiding this comment.
And this is where logging something would actually be useful
| bmh.Annotations = map[string]string{} | ||
| } | ||
|
|
||
| if syncReboot(m.HostClaim.Annotations, bmh.Annotations) { |
There was a problem hiding this comment.
We should also sync the detachment annotation (also maybe without the "force" flag being introduced in #2955 - i.e. sanitize the value). Detaching is commonly used in my world (e.g. by ACM) to prevent Ironic from further touching the deployed instance.
There was a problem hiding this comment.
@dtantsur I have not modified this part yet. I want to be sure that we want to give control of the Detach annotation to the customer.... An analogy of hostclaim is taking a taxi rather than renting/owning a car: you do not have the keys. With detach, you ask the taxi driver to throw away the keys.... but you still do not have them. As it is done, I do not see a way of sharing the right to detach by both the customer and the infra owner (no consumerOverride).
There was a problem hiding this comment.
We could have a knob to enable/disable it per HDP. But it's quite important for users to be able isolate their instance from Ironic's power management. It also allows reducing the load on Ironic.
I guess we'd need to re-attach on HostClaim deletion.
There was a problem hiding this comment.
I have added a way to authorize Detach in the HostDeployPolicy. (At least one matching policy must have the allowsDetachedMode flag set). I have also added a way to let admin detach the BMH independently of the HostClaim with a manager flag.
I still do not understand how it can really be used in practice: how is the user managing the power setting without access to the BMC ? He can reboot or halt through the OS, but the first should not interfere with Ironic and if he uses the second, he will need Ironic to power on again.
There is some kind of sanitization and the way it is done, it should "transparently" leaves off force flag when it is merged.
I would keep it as a separate revertible commit for all the reasons above.
| } | ||
|
|
||
| if err != nil { | ||
| m.SetConditionHostToFalse( |
There was a problem hiding this comment.
Maybe don't update the condition on conflict?
Update ensures the synchronization between HostClaim and BareMetalHost. * Copy HostClaim secrets to BareMetalHost namespace * Ensure propagation of reboot annotations * Synchronize hostclaim status (mirrors BareMetalHost conditions) Co-authored-by: Pierre Crégut <pierre.cregut@orange.com> Co-authored-by: Laurent Roussarie <laurent.roussarie@orange.com> Signed-off-by: Pierre Crégut <pierre.cregut@orange.com>
af309eb to
8a95ab0
Compare
cbdae05 to
a268557
Compare
When a HostClaim has a detached annotation, we propagate it to the BareMetalHost iff there is a HostDeployPolicy compatible with the claim that allows such a propagation. We add a manager flag in the "arguments" of the annotation. HostClaim do not set it to true and if the annotation does not appear in the HostClaim, it is not removed from the BareMetalHost if the manager field is set to true. This way the administrator can still force detachment. Signed-off-by: Pierre Crégut <pierre.cregut@orange.com>
a268557 to
b53e9dc
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates HostClaim/BareMetalHost synchronization logic, including secret propagation across namespaces, reboot annotation propagation, detached-mode handling gated by HostDeployPolicy, and mirroring BareMetalHost conditions into HostClaim status.
Changes:
- Add HostManager
Update()flow to synchronize BMH spec (secrets/image/custom deploy/online), reboot annotations, detached annotation, and HostClaim status. - Introduce
allowsDetachedModein HostDeployPolicy API + CRDs to gate HostClaim-driven detached mode. - Expand HostClaim manager unit tests to cover spec syncing, reboot propagation, detached sync, and status mirroring.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/hostclaim/hostclaim_manager.go | Implements HostClaim↔BMH synchronization (secrets, reboot/detach annotations, status mirroring) and refactors policy evaluation. |
| pkg/hostclaim/hostclaim_manager_test.go | Adds test coverage for new sync behaviors (setBmhSpec, reboot propagation, detached sync, status updates, Update()). |
| internal/testutil/common.go | Fixes builder typo (SetNetworkData) and adds HostDeployPolicy builder helper for detached mode. |
| apis/metal3.io/v1alpha1/hostdeploypolicy_types.go | Adds AllowsDetachedMode to HostDeployPolicySpec. |
| config/base/crds/bases/metal3.io_hostdeploypolicies.yaml | CRD schema update for allowsDetachedMode. |
| config/render/capm3.yaml | Rendered CRD/schema update for allowsDetachedMode. |
| apis/metal3.io/v1alpha1/baremetalhost_types.go | Extends detached-annotation args with manager flag to prevent HostClaim from deleting admin-set annotations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return false, err | ||
| } | ||
| inputArgs := metal3api.DetachedAnnotationArguments{} | ||
| _ = json.Unmarshal([]byte(annotValue), &inputArgs) |
There was a problem hiding this comment.
syncDetached() ignores JSON parse errors for the HostClaim detached annotation (_ = json.Unmarshal(...)). If the value is malformed, this will silently treat it as empty args and still write an annotation to the BMH. Handle the unmarshal error and return it (or at least skip syncing) to avoid unexpected behavior from invalid user input.
| _ = json.Unmarshal([]byte(annotValue), &inputArgs) | |
| if err := json.Unmarshal([]byte(annotValue), &inputArgs); err != nil { | |
| return false, err | |
| } |
| // a HostClaim. | ||
| PausedAnnotationValue = "metal3.io/hostclaim" | ||
| // rebootDomain is the domain of metal3 reboot annotations. | ||
| rebootDomain = "reboot.metal3.io" |
There was a problem hiding this comment.
rebootDomain duplicates the existing API constant metal3api.RebootAnnotationPrefix (also "reboot.metal3.io"). Using the shared constant avoids drift and keeps validation/behavior aligned across the codebase.
| rebootDomain = "reboot.metal3.io" | |
| rebootDomain = metal3api.RebootAnnotationPrefix |
| log := m.Log.WithValues("policyNamespace", hostDeployPolicy.Namespace, "policyName", hostDeployPolicy.Name) | ||
| constraints := hostDeployPolicy.Spec.HostClaimNamespaces | ||
| if constraints == nil { | ||
| m.Log.V(1).Info("Ignoring HostDeployPolicy without constraint") |
There was a problem hiding this comment.
In checkPolicy(), the "Ignoring HostDeployPolicy without constraint" log uses m.Log instead of the local log that includes policy name/namespace fields, so the contextual fields are lost.
| m.Log.V(1).Info("Ignoring HostDeployPolicy without constraint") | |
| log.V(1).Info("Ignoring HostDeployPolicy without constraint") |
| Entry("set meta-data/network-data (overide)", testCaseSetBMHSpec{ | ||
| MetaData: NewSecret("s1", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("mdt")}).Build(), | ||
| NetworkData: NewSecret("s2", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("nwdt")}).Build(), | ||
| BMHMetaData: NewSecret("bmh-metadata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | ||
| BMHNetworkData: NewSecret("bmh-networkdata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | ||
| }), | ||
| Entry("reset meta-data/network-data (overide)", testCaseSetBMHSpec{ |
There was a problem hiding this comment.
Spelling: entry title uses "overide"; should be "override" for consistency/readability in test output.
| Entry("set meta-data/network-data (overide)", testCaseSetBMHSpec{ | |
| MetaData: NewSecret("s1", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("mdt")}).Build(), | |
| NetworkData: NewSecret("s2", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("nwdt")}).Build(), | |
| BMHMetaData: NewSecret("bmh-metadata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| BMHNetworkData: NewSecret("bmh-networkdata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| }), | |
| Entry("reset meta-data/network-data (overide)", testCaseSetBMHSpec{ | |
| Entry("set meta-data/network-data (override)", testCaseSetBMHSpec{ | |
| MetaData: NewSecret("s1", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("mdt")}).Build(), | |
| NetworkData: NewSecret("s2", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("nwdt")}).Build(), | |
| BMHMetaData: NewSecret("bmh-metadata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| BMHNetworkData: NewSecret("bmh-networkdata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| }), | |
| Entry("reset meta-data/network-data (override)", testCaseSetBMHSpec{ |
| } | ||
| bmh := bmhBuilder.Build() | ||
| objects = append(objects, hostClaim, bmh) | ||
| // Add secrets if they exists |
There was a problem hiding this comment.
Grammar in comment: "Add secrets if they exists" should be "Add secrets if they exist".
| // Add secrets if they exists | |
| // Add secrets if they exist |
| namespace string, | ||
| hostName string, | ||
| ) (*corev1.SecretReference, error) { | ||
| if namespace == bmh.Namespace { |
There was a problem hiding this comment.
synchronizeDataSecret() will panic when sourceRef is nil and the HostClaim namespace matches the BMH namespace: sourceRef.DeepCopy() dereferences a nil pointer. Return nil, nil (or just nil) when sourceRef == nil in this branch.
| if namespace == bmh.Namespace { | |
| if namespace == bmh.Namespace { | |
| if sourceRef == nil { | |
| return nil, nil | |
| } |
| for _, hdp := range policiesInNs.Items { | ||
| if accept, err := m.checkPolicy(&hdp, hostNs, nsLabels); err != nil { | ||
| return nil, err | ||
| } else if accept { | ||
| compatiblePolicies = append(compatiblePolicies, &hdp) |
There was a problem hiding this comment.
GetCompatiblePolicies() appends &hdp where hdp is the range variable; all returned pointers will reference the same item (the last policy). Iterate by index (e.g., for i := range policiesInNs.Items) or copy into a new variable before taking the address.
| for _, hdp := range policiesInNs.Items { | |
| if accept, err := m.checkPolicy(&hdp, hostNs, nsLabels); err != nil { | |
| return nil, err | |
| } else if accept { | |
| compatiblePolicies = append(compatiblePolicies, &hdp) | |
| for i := range policiesInNs.Items { | |
| hdp := &policiesInNs.Items[i] | |
| if accept, err := m.checkPolicy(hdp, hostNs, nsLabels); err != nil { | |
| return nil, err | |
| } else if accept { | |
| compatiblePolicies = append(compatiblePolicies, hdp) |
| if k8serrors.IsNotFound(err) { | ||
| log.Error(err, "Missing source Secret for synchronization from claim to BMH", "source", sourceKey) | ||
| return nil, RequeueAfterError{RequeueAfter: HostClaimRequeueDelay} | ||
| } |
There was a problem hiding this comment.
When the source Secret is not found, this returns a bare RequeueAfterError, whose message becomes just "requeue in 30s". Consider wrapping it with context about which secret is missing (e.g., include sourceKey in the error) so the HostClaim condition/message is actionable.
| Entry("set meta-data/network-data (overide)", testCaseSetBMHSpec{ | ||
| MetaData: NewSecret("s1", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("mdt")}).Build(), | ||
| NetworkData: NewSecret("s2", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("nwdt")}).Build(), | ||
| BMHMetaData: NewSecret("bmh-metadata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | ||
| BMHNetworkData: NewSecret("bmh-networkdata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | ||
| }), | ||
| Entry("reset meta-data/network-data (overide)", testCaseSetBMHSpec{ |
There was a problem hiding this comment.
Spelling: entry title uses "overide"; should be "override" for consistency/readability in test output.
| Entry("set meta-data/network-data (overide)", testCaseSetBMHSpec{ | |
| MetaData: NewSecret("s1", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("mdt")}).Build(), | |
| NetworkData: NewSecret("s2", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("nwdt")}).Build(), | |
| BMHMetaData: NewSecret("bmh-metadata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| BMHNetworkData: NewSecret("bmh-networkdata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| }), | |
| Entry("reset meta-data/network-data (overide)", testCaseSetBMHSpec{ | |
| Entry("set meta-data/network-data (override)", testCaseSetBMHSpec{ | |
| MetaData: NewSecret("s1", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("mdt")}).Build(), | |
| NetworkData: NewSecret("s2", HostclaimNamespace).SetData(map[string][]byte{"f": []byte("nwdt")}).Build(), | |
| BMHMetaData: NewSecret("bmh-metadata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| BMHNetworkData: NewSecret("bmh-networkdata", "ns").SetData(map[string][]byte{"f": []byte("other")}).Build(), | |
| }), | |
| Entry("reset meta-data/network-data (override)", testCaseSetBMHSpec{ |
What this PR does / why we need it:
Update ensures the synchronization between HostClaim and BareMetalHost.
Checklist: